Building a Time Series Data Product for Battery Analytics
A battery management system (BMS) generates a continuous stream of data — voltage, current, temperature, state of charge. On the surface, this looks like rich material for analytics. In practice, raw BMS telemetry is noisy, inconsistent, and full of gaps. It becomes useful only after a product team makes deliberate decisions about how to collect, clean, structure, and interpret it.
This post walks through those decisions. It is written for product managers building analytics features on top of battery data — whether for a battery energy storage system (BESS), an electric vehicle fleet, or an industrial uninterruptible power supply (UPS).
What the BMS Actually Produces
Before designing a data product, you need to know what your source data looks like. A BMS typically reports the following signals:
- Cell voltage — the voltage of each individual cell or, on some units, of a series-connected group of cells (a string). The most granular and diagnostically useful signal in the pack.
- Pack current — the total current flowing in or out of the battery. Positive during discharge, negative during charge (or vice versa, depending on convention).
- Temperature — measured at multiple points: cell level, module level, pack inlet and outlet, and sometimes ambient. A battery pack may have dozens of temperature sensors.
- State of Charge (SoC) — the BMS’s estimate of remaining capacity, expressed as a percentage. Calculated internally by the BMS, not measured directly.
- State of Health (SoH) — the BMS’s estimate of how much of the original capacity the battery retains. Degrades over time with cycling and ageing.
- Cycle count — the number of full charge-discharge cycles completed.
- Fault and alarm codes — status flags indicating cell overvoltage, undervoltage, over-temperature, communication errors, and other conditions.
- Internal resistance — some BMS units estimate this during charge or discharge. It is a key indicator of battery ageing.
PM implication: Not all BMS units report all of these signals. Cell-level voltage data is often available on premium BMS hardware but aggregated to string or pack level on lower-cost units. Before committing to an analytics feature — for example, a cell-level voltage imbalance alert — confirm that the hardware in your target sites actually reports it. A feature built on data that does not exist in the field is not a feature; it is a gap.
Cadence Decisions: How Often to Sample
The rate at which you collect data has a direct impact on what analytics you can build, how much storage you consume, and what you can charge for.
| Cadence | Typical use case |
|---|---|
| 100 ms – 1 s | Fault detection, protection systems, control loops |
| 10 s – 60 s | Operational monitoring, real-time dashboards |
| 5 min – 15 min | Trend analytics, reporting, ML training data |
| 1 hour+ | Long-term degradation tracking, billing |
A common mistake is collecting everything at the highest possible rate. This creates storage and processing costs that grow quickly as the fleet scales, while providing little additional analytical value for most use cases. A battery’s state of health does not change meaningfully in one second.
A better approach is tiered sampling: collect high-frequency data for signals where speed matters (fault detection on cell voltage), and lower-frequency data for signals used in trend analytics (SoH, temperature over weeks).
PM implication: Cadence is a product decision, not just a technical one. It determines your storage costs, your data retention policy, and which analytics features are feasible. Define it explicitly in your requirements — “collect cell voltage at 10-second intervals, temperature at 60-second intervals” — rather than leaving it as an engineering assumption. Revisit it when you onboard new BMS hardware, because different vendors have different default reporting rates.
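To make that concrete, here is a minimal sketch of what a tiered sampling specification might look like as configuration, with a rough storage estimate attached. The signal names, intervals, retention periods, and channel counts are illustrative assumptions, not a standard schema.

```python
from dataclasses import dataclass


@dataclass
class SamplingTier:
    interval_s: int      # how often the gateway reports the signal
    retention_days: int  # how long raw readings are kept before downsampling


# Illustrative plan: signal names, intervals, and retention are assumptions.
SAMPLING_PLAN = {
    "cell_voltage":     SamplingTier(interval_s=10, retention_days=90),
    "pack_current":     SamplingTier(interval_s=10, retention_days=90),
    "cell_temperature": SamplingTier(interval_s=60, retention_days=365),
    "state_of_charge":  SamplingTier(interval_s=60, retention_days=365),
    "state_of_health":  SamplingTier(interval_s=3600, retention_days=3650),
}


def rows_per_day(plan, channels_per_signal):
    """Rough estimate of raw rows written per asset per day, to size storage."""
    seconds_per_day = 86_400
    return sum(
        (seconds_per_day // tier.interval_s) * channels_per_signal.get(name, 1)
        for name, tier in plan.items()
    )


# Example: a pack with 192 cell-voltage channels and 24 temperature sensors.
print(rows_per_day(SAMPLING_PLAN, {"cell_voltage": 192, "cell_temperature": 24}))
```

Writing the plan down as configuration makes the storage consequences of a cadence decision visible before the fleet scales, and gives you one place to review when a new BMS vendor is onboarded.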
Data Quality at the Source
Raw BMS data is rarely clean. These are the most common problems you will encounter:
- Timestamp drift — the BMS clock may not be synchronised to a reliable time source. Timestamps from different devices on the same site can diverge by seconds or minutes, making it impossible to correlate signals accurately.
- Missing data — communication outages, gateway reboots, and network drops create gaps. A missing 10-minute window in an SoC trend is benign. A missing window during a fault event can be critical.
- Unit inconsistencies — different BMS vendors report voltage in millivolts or volts, current in amps or milliamps, temperature in Celsius or Fahrenheit. Without normalisation at ingestion, your analytics will silently produce wrong results.
- SoC estimation error — the BMS calculates SoC using algorithms that accumulate error over time, especially at low temperatures or high discharge rates. The value it reports may differ from actual capacity by several percentage points.
- Sensor faults — a failed temperature sensor reports a fixed value or zero. Without outlier detection, these values pass through to dashboards and models as if they were real readings.
PM implication: Data quality is not a data engineering concern that sits below the product layer. It determines the reliability of every feature above it. Define your data quality acceptance criteria the same way you define functional requirements: what percentage of readings may be missing before an alert fires? What timestamp tolerance is acceptable for cross-signal correlation? What happens in the UI when data quality falls below threshold — do you show stale data, a gap, or a warning? These are product decisions. Make them deliberately.
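To show what ingestion-time controls can look like, here is a minimal sketch of unit normalisation and sensor-fault flagging in Python. The column names, vendor scaling table, plausibility limits, and stuck-sensor window are assumptions for illustration; a production pipeline would source them from device metadata and per-chemistry limits rather than hard-coded tables.

```python
import pandas as pd

# Assumed per-vendor unit conventions; in practice this comes from device metadata.
UNIT_SCALE = {
    ("vendor_a", "cell_voltage"): 0.001,  # reports millivolts -> convert to volts
    ("vendor_b", "cell_voltage"): 1.0,    # already reports volts
}

# Assumed plausibility limits for a Li-ion cell; tune per chemistry.
VOLTAGE_RANGE_V = (1.5, 4.5)
TEMP_RANGE_C = (-40.0, 85.0)


def normalise_and_flag(readings: pd.DataFrame) -> pd.DataFrame:
    """Normalise units at ingestion and flag readings that cannot be real.

    Assumed columns: device_id, vendor, signal, timestamp, value.
    """
    df = readings.copy()

    # Unit normalisation: scale each row according to its vendor/signal pair.
    scale = df.apply(lambda r: UNIT_SCALE.get((r["vendor"], r["signal"]), 1.0), axis=1)
    df["value_si"] = df["value"] * scale

    # Out-of-range readings: values a real cell or sensor cannot produce.
    is_voltage = df["signal"] == "cell_voltage"
    is_temp = df["signal"] == "cell_temperature"
    out_of_range = (
        (is_voltage & ~df["value_si"].between(*VOLTAGE_RANGE_V))
        | (is_temp & ~df["value_si"].between(*TEMP_RANGE_C))
    )

    # Stuck sensors: zero variance over a trailing window of 30 readings.
    stuck = (
        df.sort_values("timestamp")
          .groupby(["device_id", "signal"])["value_si"]
          .transform(lambda s: s.rolling(30, min_periods=30).std() == 0)
    )

    # Flag suspect readings rather than silently dropping them.
    df["quality_flag"] = "ok"
    df.loc[out_of_range, "quality_flag"] = "out_of_range"
    df.loc[stuck & ~out_of_range, "quality_flag"] = "stuck_sensor"
    return df
```

The flags, rather than silent drops, are what let the UI answer the product question above: show a gap, a warning, or stale data, but never a fabricated reading.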
Feature Engineering for Degradation Signals
Raw signals tell you what the battery is doing right now. Engineered features tell you where it is heading.
These are the most useful features for battery degradation analytics:
Incremental Capacity Analysis (ICA) — the derivative of charge with respect to voltage (dQ/dV) during a slow, controlled charge. The shape of this curve changes predictably as the battery ages. Peaks shift and flatten as capacity fade progresses. ICA requires high-resolution voltage and current data collected during a consistent charge protocol.
Differential Voltage Analysis (DVA) — the derivative of voltage with respect to charge (dV/dQ). Similar diagnostic value to ICA, often easier to compute from field data. Detects lithium plating and electrode degradation.
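As a sketch of how ICA and DVA curves can be derived from a single charge segment, the snippet below differentiates cumulative charge against voltage with simple numerical gradients. It assumes a slow, monotonic charge sampled at a fixed interval with current positive during charge; real field data needs smoothing before the peak positions are interpreted.

```python
import numpy as np


def ica_dva_features(voltage_v: np.ndarray, current_a: np.ndarray, dt_s: float):
    """Compute dQ/dV (ICA) and dV/dQ (DVA) curves from one charge segment.

    Assumes the inputs cover a single slow charge sampled every dt_s seconds,
    with current positive during charge.
    """
    # Cumulative charge transferred, in ampere-hours.
    charge_ah = np.cumsum(current_a) * dt_s / 3600.0

    # Keep a strictly increasing voltage subsequence so dV is never zero.
    prev_max = np.maximum.accumulate(voltage_v)
    keep = np.concatenate(([True], voltage_v[1:] > prev_max[:-1]))
    v, q = voltage_v[keep], charge_ah[keep]

    dq_dv = np.gradient(q, v)   # ICA curve: peaks shift and flatten with capacity fade
    dv_dq = np.gradient(v, q)   # DVA curve: sensitive to plating and electrode loss
    return v, dq_dv, dv_dq
```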
Internal resistance growth — estimated by measuring the voltage drop across a known current pulse. Internal resistance grows with age and temperature exposure. Tracking it over time is one of the most reliable indicators of remaining battery life.
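A minimal sketch of the pulse method, assuming the segment contains one clear current step: resistance is approximated as the averaged voltage change divided by the averaged current change across that step. The step-size threshold and averaging window are illustrative.

```python
import numpy as np


def pulse_resistance_ohm(voltage_v: np.ndarray, current_a: np.ndarray,
                         min_step_a: float = 10.0, window: int = 5):
    """Estimate internal resistance as dV/dI across the largest current step.

    Averages a short window before and after the step to reduce noise.
    Returns None if no step larger than min_step_a is found.
    """
    steps = np.diff(current_a)
    idx = int(np.argmax(np.abs(steps)))
    if abs(steps[idx]) < min_step_a or idx < window or idx + window >= len(current_a):
        return None

    v_before = voltage_v[idx - window:idx].mean()
    v_after = voltage_v[idx + 1:idx + 1 + window].mean()
    i_before = current_a[idx - window:idx].mean()
    i_after = current_a[idx + 1:idx + 1 + window].mean()

    return abs((v_after - v_before) / (i_after - i_before))
```

Tracked per asset over months, the output of a function like this is the trend line that matters, not any single estimate.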
Temperature rise during charge — a battery with higher internal resistance generates more heat for the same charging current. Tracking the delta between cell temperature and ambient temperature during charge reveals resistance changes before the BMS reports them as a fault.
SoC estimation drift — if the BMS’s reported SoC drifts consistently away from the true SoC (measured by a reference charge), it indicates that the BMS model is no longer calibrated to the pack’s actual capacity. This itself is a degradation signal.
Depth of discharge (DoD) distribution — a statistical summary of how deeply the battery is cycled over time. Frequent deep discharges accelerate degradation. Knowing the DoD distribution allows you to model remaining useful life more accurately than cycle count alone.
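Here is a minimal sketch of a DoD distribution summary, assuming cycles have already been segmented into per-cycle SoC minima and maxima. Production pipelines often use rainflow counting instead, which handles partial and nested cycles more rigorously.

```python
import numpy as np


def dod_distribution(cycle_soc_min: np.ndarray, cycle_soc_max: np.ndarray) -> dict:
    """Summarise depth of discharge per cycle as distribution statistics.

    Inputs are the minimum and maximum SoC (in %) observed in each segmented
    cycle; DoD is the swing between them.
    """
    dod = cycle_soc_max - cycle_soc_min
    return {
        "cycles": int(dod.size),
        "mean_dod_pct": float(dod.mean()),
        "p50_dod_pct": float(np.percentile(dod, 50)),
        "p95_dod_pct": float(np.percentile(dod, 95)),
        "deep_cycle_fraction": float((dod > 80.0).mean()),  # share of cycles deeper than 80% DoD
    }
```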
PM implication: Feature engineering is where your data product creates differentiated value. A competitor can buy the same BMS hardware and collect the same raw signals. They cannot easily replicate your engineered features, your models trained on thousands of real-world cycles, or your interpretations of what those features mean for a specific battery chemistry. Treat your feature library as a product asset — document it, version it, and protect it.
Defining “Done” for an Analytics Feature
Analytics features are different from functional features. A button either works or it does not. An analytics feature exists on a spectrum — it can be more or less accurate, more or less timely, and it can cover more or fewer assets in the fleet.
This makes “done” harder to define. Here is a framework for writing meaningful acceptance criteria for analytics features:
1. Accuracy threshold
What level of error is acceptable? For an SoH estimate, is ±5% acceptable, or does the use case require ±2%? Define this in terms your users understand — not just RMSE (root mean squared error) but what that error means in practice (e.g. “the remaining life estimate may be off by up to 6 months”).
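One way to make the accuracy criterion testable is a check like the sketch below, which compares SoH estimates against reference capacity measurements and an assumed ±2% tolerance. The metric and threshold are placeholders; the point is that they are written down and enforced automatically.

```python
import numpy as np


def soh_accuracy_check(estimated_pct: np.ndarray, reference_pct: np.ndarray,
                       tolerance_pct: float = 2.0) -> dict:
    """Compare SoH estimates to reference capacity tests against a tolerance."""
    error = estimated_pct - reference_pct
    rmse = float(np.sqrt(np.mean(error ** 2)))
    within = float((np.abs(error) <= tolerance_pct).mean())
    return {
        "rmse_pct": rmse,
        "share_within_tolerance": within,  # e.g. require >= 0.95 before shipping
        "passes": rmse <= tolerance_pct,
    }
```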
2. Data coverage
What percentage of the fleet must have sufficient data for the feature to run? A degradation model that works on 60% of assets and silently fails on the other 40% is not a shippable feature — it is a prototype. Define a minimum coverage threshold and a fallback behaviour for assets below it.
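A minimal sketch of a coverage gate, assuming a per-asset summary table with asset_id and days_of_data columns; the 30-day history requirement and 95% fleet threshold are illustrative.

```python
import pandas as pd


def coverage_report(asset_days: pd.DataFrame, min_days: int = 30,
                    min_fleet_coverage: float = 0.95) -> dict:
    """Check what share of the fleet has enough history for the feature to run.

    Assumed columns: asset_id, days_of_data (one row per asset).
    """
    eligible = asset_days["days_of_data"] >= min_days
    coverage = float(eligible.mean())
    return {
        "fleet_coverage": coverage,
        "ineligible_assets": asset_days.loc[~eligible, "asset_id"].tolist(),
        "meets_threshold": coverage >= min_fleet_coverage,
    }
```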
3. Latency
How fresh does the insight need to be? An SoH report updated monthly is appropriate for asset owners making replacement decisions. An anomaly alert that fires 24 hours after the event is not useful for operations teams. Define the update frequency as part of the acceptance criteria.
4. Baseline comparison
How does the ML-based feature compare to the simplest possible alternative? If a rule-based threshold alert (e.g. “fire if cell voltage drops below 2.8 V”) catches 90% of the faults that your ML model catches, the ML model needs to earn its complexity. Define what “better than baseline” means before you ship.
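As a sketch of what that comparison can look like, the snippet below scores the fixed-threshold undervoltage rule against labelled fault windows, giving the ML model a concrete recall number to beat. The labelling scheme and window framing are assumptions; the 2.8 V threshold is the example above.

```python
import numpy as np


def baseline_undervoltage_recall(min_cell_voltage_v: np.ndarray,
                                 fault_label: np.ndarray,
                                 threshold_v: float = 2.8) -> float:
    """Recall of a rule-based alert that fires when the minimum cell voltage in
    a window drops below threshold_v. fault_label marks windows with a
    confirmed fault (assumed labelling)."""
    fired = min_cell_voltage_v < threshold_v
    true_faults = fault_label.astype(bool)
    if true_faults.sum() == 0:
        return float("nan")
    return float((fired & true_faults).sum() / true_faults.sum())
```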
5. Monitoring in production
How will you know if the feature degrades after release? Models drift as battery chemistry changes, as operating patterns shift, and as new hardware is onboarded. Define what monitoring you will put in place — and who owns it — before the feature ships.
Questions Every PM Should Be Able to Answer
Before writing specs for a battery analytics feature, make sure you can answer these:
- What signals does the BMS report, and at what granularity? Cell-level, string-level, or pack-level only?
- Are timestamps reliable? Is the BMS clock synchronised to NTP or GPS? What is the acceptable drift tolerance?
- What cadence is data collected at, and is it consistent? Does the rate vary by site, by BMS vendor, or by operating mode?
- How are missing data gaps handled? Does the pipeline interpolate, flag, or drop them? What does the UI show?
- What is the SoC estimation method? Coulomb counting, open-circuit voltage, or a Kalman filter? Does it recalibrate, and how often?
- Has the model been validated on real field data? Or only on a controlled lab dataset with a single battery chemistry?
- Who owns model monitoring in production? Data engineering, data science, or the product team?
Final Thoughts
Raw BMS telemetry is the starting point, not the product. The product is what you build on top of it — the cadence decisions, the quality controls, the engineered features, and the acceptance criteria that define what “working” actually means.
The PMs who build the best battery analytics products are not the ones who let engineers decide these things by default. They are the ones who treat data collection, data quality, and feature definition as first-class product decisions — and make them deliberately, early, and in writing.
If you have not already read The IIoT Stack for Renewable Energy Sites, that post covers the infrastructure layer that sits beneath everything described here — from field protocols to cloud storage — and is a useful companion to this one.